variational energy
Robust volatility updates for Hierarchical Gaussian Filtering
Mathys, Christoph, Legrand, Nicolas, Waade, Peter Thestrup, Mikus, Nace, Weber, Lilian Aline
Hierarchical Gaussian Filtering (HGF) networks allow for efficient updating of posterior distributions (beliefs) about hidden states of an agent's environment. HGF parent nodes can target the mean or variance of their children. New information entering at input nodes leads to a cascade of belief updates across the network according to one-step update equations for each node's mean and precision (inverse variance). However, the original form of the update equations for variance-targeting parents (volatility coupling) can, in some regions of parameter space, lead to negative posterior precision, a logical impossibility that causes the updating algorithm to terminate with an error. In this report, we introduce a modified quadratic approximation to the variational energy of volatility-coupled nodes that avoids negative posterior precision. The key idea is to interpolate between two quadratic expansions of the variational energy: one at the prior prediction and one at a second mode whose location is obtained in closed form via the Lambert W function. The resulting update equations are robust across the entire parameter space and faithfully track the variational posterior even for large prediction errors.
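The abstract above does not state the equation whose solution locates the second mode, so the following is only a minimal sketch of the Lambert W function itself: the principal branch W0 solves w·exp(w) = z for z ≥ 0. The function name `lambert_w0`, the Newton iteration, and the starting guess are illustrative assumptions and are not taken from the report.

```python
import math


def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal branch of Lambert W for z >= 0, i.e. the w with w * exp(w) == z.

    Hypothetical helper for illustration only; uses plain Newton iteration.
    """
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    # log(1 + z) lies at or to the right of the root, so Newton's method
    # on the convex, increasing function w * exp(w) - z converges monotonically.
    w = math.log1p(z)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


# Usage: check the defining identity for one value.
w_example = lambert_w0(5.0)
residual = w_example * math.exp(w_example) - 5.0
```

Because W0 can be evaluated in a handful of iterations, a mode location expressed through it is cheap to compute inside a one-step update.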
Mitigating the Curse of Detail: Scaling Arguments for Feature Learning and Sample Complexity
Rubin, Noa, Davidovich, Orit, Ringel, Zohar
Two pressing topics in the theory of deep learning are the interpretation of feature learning mechanisms and the determination of the implicit bias of networks in the rich regime. Current theories of rich feature learning often appear in the form of high-dimensional non-linear equations, which require computationally intensive numerical solutions. Given the many details that go into defining a deep learning problem, this complexity is a significant and often unavoidable challenge. Here, we propose a powerful heuristic route for predicting the data and width scales at which various patterns of feature learning emerge. This form of scale analysis is considerably simpler than exact theories and reproduces the scaling exponents of various known results. In addition, we make novel predictions on complex toy architectures, such as three-layer non-linear networks and attention heads, thus extending the scope of first-principle theories of deep learning.
Hybrid Ground-State Quantum Algorithms based on Neural Schrödinger Forging
de Schoulepnikoff, Paulin, Kiss, Oriel, Vallecorsa, Sofia, Carleo, Giuseppe, Grossi, Michele
Entanglement-forging-based variational algorithms leverage the bi-partition of quantum systems for addressing ground state problems. The primary limitation of these approaches lies in the exponential summation required over the numerous potential basis states, or bitstrings, when performing the Schmidt decomposition of the whole system. To overcome this challenge, we propose a new method for entanglement forging employing generative neural networks to identify the most pertinent bitstrings, eliminating the need for the exponential sum. Through empirical demonstrations on systems of increasing complexity, we show that the proposed algorithm performs comparably to or better than the existing standard implementation of entanglement forging. Moreover, by controlling the amount of required resources, this scheme can be applied to larger, as well as non-permutation-invariant systems, where the latter constraint is associated with the Heisenberg forging procedure. We substantiate our findings through numerical simulations conducted on spin models exhibiting one-dimensional ring and two-dimensional triangular lattice topologies, and on nuclear shell model configurations.
An Empirical Study of Quantum Dynamics as a Ground State Problem with Neural Quantum States
Vargas-Calderón, Vladimir, Vinck-Posada, Herbert, González, Fabio A.
A central problem of quantum physics, be it fundamental quantum physics or applications for quantum technology, is the ground state problem. It can be defined as finding a state vector |Ψ⟩ that minimises the expected value of the Hamiltonian Ĥ that represents the energetic interactions between the different parts that make up a quantum physical system. It is well known that the difficulty of solving the ground state problem for a physical system arises from the exponential growth of the Hilbert space with respect to the number of system components and their dimension. Therefore, techniques such as exact diagonalisation of Ĥ quickly become insufficient to find the ground state, and other approximate methods have to be used. Interestingly, other central problems of quantum physics, such as finding the evolution of a quantum system, can be cast into the ground state problem, as demonstrated by the Feynman-Kitaev formalism [24]. An immediate implication of using this formalism is that the computational tools historically developed for solving the ground state problem can be used to find the dynamics of a physical system. Broadly speaking, the Feynman-Kitaev formalism appends a clock as an auxiliary subsystem of the main physical system, i.e. the Hilbert space H of the whole system is H = P ⊗ C, where P is the Hilbert space of the main physical system and C is the Hilbert space of the clock.
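To make the clock construction concrete, here is a minimal sketch of one common textbook form of a Feynman-Kitaev Hamiltonian for a discretised evolution with unitaries U_1, ..., U_T. It is an assumption for illustration and not necessarily the exact operator used in the paper. A state in P ⊗ C is stored as one system-space block per clock time t, and the code checks that the "history state" (the blocks U_t ... U_1 ψ0, normalised) has zero energy while a state that ignores the dynamics does not. All function names are hypothetical, and ψ0 is assumed to be normalised.

```python
import math


def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def dagger(m):
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]


def inner(a, b):
    return sum(x.conjugate() * y for x, y in zip(a, b))


def history_state(unitaries, psi0):
    """Blocks phi[t] = U_t ... U_1 psi0 / sqrt(T + 1), one per clock time t."""
    norm = math.sqrt(len(unitaries) + 1)
    blocks = [list(psi0)]
    for u in unitaries:
        blocks.append(matvec(u, blocks[-1]))
    return [[x / norm for x in block] for block in blocks]


def apply_clock_hamiltonian(unitaries, psi0, phi):
    """Apply H = H_init + sum_t H_prop(t) to a state given as clock-time blocks."""
    d = len(psi0)
    out = [[0j] * d for _ in phi]
    # H_init penalises any component at clock time 0 orthogonal to psi0.
    overlap = inner(psi0, phi[0])
    for i in range(d):
        out[0][i] += phi[0][i] - overlap * psi0[i]
    # H_prop(t) penalises phi[t] differing from U_t applied to phi[t - 1].
    for t, u in enumerate(unitaries, start=1):
        forward = matvec(u, phi[t - 1])
        backward = matvec(dagger(u), phi[t])
        for i in range(d):
            out[t][i] += 0.5 * (phi[t][i] - forward[i])
            out[t - 1][i] += 0.5 * (phi[t - 1][i] - backward[i])
    return out


def energy(unitaries, psi0, phi):
    h_phi = apply_clock_hamiltonian(unitaries, psi0, phi)
    return sum(inner(a, b) for a, b in zip(phi, h_phi)).real


# Usage: a single qubit rotated about the x axis in four equal steps.
theta = 0.3
u_step = [[math.cos(theta), -1j * math.sin(theta)],
          [-1j * math.sin(theta), math.cos(theta)]]
unitaries = [u_step] * 4
psi0 = [1.0, 0.0]

e_history = energy(unitaries, psi0, history_state(unitaries, psi0))
n_clock = len(unitaries) + 1
static = [[x / math.sqrt(n_clock) for x in psi0] for _ in range(n_clock)]
e_static = energy(unitaries, psi0, static)
```

Each term of this Hamiltonian is positive semidefinite, so a zero-energy state is a ground state. This is why ground-state solvers can, in principle, be pointed at a dynamics problem.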
Transfer learning enhanced physics informed neural network for phase-field modeling of fracture
Goswami, Somdatta, Anitescu, Cosmin, Chakraborty, Souvik, Rabczuk, Timon
We present a new physics informed neural network (PINN) algorithm for solving brittle fracture problems. While most of the PINN algorithms available in the literature minimize the residual of the governing partial differential equation, the proposed approach takes a different path by minimizing the variational energy of the system. Additionally, we modify the neural network output such that the boundary conditions associated with the problem are exactly satisfied. Compared to conventional residual based PINN, the proposed approach has two major advantages. First, the imposition of boundary conditions is simpler and more robust. Second, the derivatives appearing in the functional form of the variational energy are of lower order than those in the residual form. Hence, training the network is faster. To compute the total variational energy of the system, we propose an efficient scheme that takes as input a geometry described by a spline based CAD model and employs Gauss quadrature rules for numerical integration. Moreover, we utilize the concept of transfer learning to obtain the crack path in an efficient manner. The proposed approach is used to solve four fracture mechanics problems. For all the examples, results obtained using the proposed approach match closely with the results available in the literature. For the first two examples, we compare the results obtained using the proposed approach with the conventional residual based neural network results. For both problems, the proposed approach is found to yield better accuracy than conventional residual based PINN algorithms.
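The energy-minimisation idea can be illustrated on a much simpler problem than fracture. The sketch below is a toy under stated assumptions and is not the authors' method: it replaces the neural network with a two-parameter polynomial, takes the 1D Poisson problem -u'' = 1 on [0, 1] with u(0) = u(1) = 0, builds the boundary conditions into the trial function via the factor x(1 - x), integrates the energy E[u] = ∫ (½ u'² - f u) dx with a 3-point Gauss-Legendre rule, and minimises it by plain gradient descent. Only first derivatives of u appear, whereas the residual form would need u''.

```python
import math

# 3-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1].
_R = math.sqrt(3.0 / 5.0)
NODES = [(1.0 - _R) / 2.0, 0.5, (1.0 + _R) / 2.0]
WEIGHTS = [5.0 / 18.0, 8.0 / 18.0, 5.0 / 18.0]


def basis(x):
    # Each function vanishes at x = 0 and x = 1, so the boundary
    # conditions hold exactly for any coefficients.
    return [x * (1.0 - x), x * x * (1.0 - x)]


def dbasis(x):
    return [1.0 - 2.0 * x, 2.0 * x - 3.0 * x * x]


def source(x):
    return 1.0  # assumed right-hand side f(x) = 1


def energy(a):
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        u = sum(ai * p for ai, p in zip(a, basis(x)))
        du = sum(ai * dp for ai, dp in zip(a, dbasis(x)))
        total += w * (0.5 * du * du - source(x) * u)
    return total


def energy_gradient(a):
    grad = [0.0] * len(a)
    for x, w in zip(NODES, WEIGHTS):
        du = sum(ai * dp for ai, dp in zip(a, dbasis(x)))
        for k, (p, dp) in enumerate(zip(basis(x), dbasis(x))):
            grad[k] += w * (du * dp - source(x) * p)
    return grad


def minimise(a, lr=2.0, steps=2000):
    for _ in range(steps):
        g = energy_gradient(a)
        a = [ai - lr * gi for ai, gi in zip(a, g)]
    return a


# Usage: the exact solution is u(x) = x (1 - x) / 2, i.e. coefficients (0.5, 0).
coeffs = minimise([0.0, 0.0])
```

In the setting of the abstract, the polynomial coefficients would be replaced by network weights and the hand-written gradient by automatic differentiation. The structure (trial function satisfying the boundary conditions, quadrature of the energy, gradient-based minimisation) stays the same.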
Cluster Variational Approximations for Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data
Linzner, Dominik, Koeppl, Heinz
Continuous-time Bayesian networks (CTBNs) constitute a general and powerful framework for modeling continuous-time stochastic processes on networks. This makes them particularly attractive for learning the directed structures among interacting entities. However, if the available data is incomplete, one needs to simulate the prohibitively complex CTBN dynamics. Existing approximation techniques, such as sampling and low-order variational methods, either scale unfavorably in system size or are unsatisfactory in terms of accuracy. Inspired by recent advances in statistical physics, we present a new approximation scheme based on cluster-variational methods that significantly improves upon existing variational approximations. We can analytically marginalize the parameters of the approximate CTBN, as these are of secondary importance for structure learning. This recovers a scalable scheme for direct structure learning from incomplete and noisy time-series data. Our approach outperforms existing methods in terms of scalability.
The variational Laplace approach to approximate Bayesian inference
Variational approaches to approximate Bayesian inference provide very efficient means of performing parameter estimation and model selection. Among these, so-called variational-Laplace or VL schemes rely on Gaussian approximations to posterior densities on model parameters. In this note, we review the main variants of VL approaches that follow from considering nonlinear models of continuous and/or categorical data. En passant, we also derive a few novel theoretical results that complete the portfolio of existing analyses of variational Bayesian approaches, including investigations of their asymptotic convergence. We also suggest practical ways of extending existing VL approaches to hierarchical generative models that include (e.g., precision) hyperparameters.